A Chinese Product Feature Extraction Method Based on KNN Algorithm
Abstract
Product feature sets extracted from online reviews by current methods cover only a small share of the review information. To address this problem, this paper proposes a product feature extraction method based on the KNN algorithm. We first establish a classification system for the product feature set. We then manually extract a subset of product features as a training set and, using the similarity between words and the classification system, quickly classify and extract the product features of all reviews. Finally, the PMI algorithm is used to filter and supplement the result, improving both the accuracy and the review information coverage rate of the product feature set. Experiments on online clothing review data from the Taobao platform show that this method effectively improves the review information coverage rate of the product feature set.
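The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how word similarity is computed or how PMI is thresholded, so the sketch assumes words are represented by small numeric vectors compared with cosine similarity, and that PMI is estimated from review-level co-occurrence. The names `cosine`, `knn_classify`, `pmi`, and all of the toy data are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_classify(vector, training, k=3):
    """Assign the majority label among the k most similar training words.

    training is a list of (vector, label) pairs, standing in for the
    manually labeled product features (an assumption of this sketch).
    """
    ranked = sorted(training, key=lambda item: cosine(vector, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def pmi(word, seed, reviews):
    """Pointwise mutual information of two words over review co-occurrence.

    reviews is a list of token sets; returns -inf if the words never co-occur.
    """
    n = len(reviews)
    n_word = sum(1 for r in reviews if word in r)
    n_seed = sum(1 for r in reviews if seed in r)
    n_both = sum(1 for r in reviews if word in r and seed in r)
    if n_both == 0:
        return float("-inf")
    return math.log2(n_both * n / (n_word * n_seed))


# Hypothetical 2-D word vectors labeled with feature categories.
training = [
    ((1.0, 0.0), "fabric"),
    ((0.9, 0.1), "fabric"),
    ((0.0, 1.0), "size"),
    ((0.1, 0.9), "size"),
]
predicted = knn_classify((0.8, 0.2), training, k=3)

# Hypothetical tokenized reviews used to score candidate words against a seed.
reviews = [
    {"cotton", "fabric"},
    {"cotton", "fabric"},
    {"size", "small"},
    {"fabric", "soft"},
]
keep_score = pmi("cotton", "fabric", reviews)
drop_score = pmi("small", "fabric", reviews)
```

In this toy setup a candidate word with a high PMI against a known feature word would be kept or added, and one that never co-occurs with it would be filtered out. The cutoff itself is a design choice that the abstract does not specify.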
Similar resources
A Novel Scheme for Improving Accuracy of KNN Classification Algorithm Based on the New Weighting Technique and Stepwise Feature Selection
The K nearest neighbor algorithm is one of the most frequently used techniques in data mining for its integrity and performance. Though the KNN algorithm is highly effective in many cases, it has some essential deficiencies, which affect the classification accuracy of the algorithm. First, the effectiveness of the algorithm is affected by redundant and irrelevant features. Furthermore, this algori...
An Improved CHI Feature Selection Method for Chinese Text Classification
We propose a feature selection method named ICHI, based on an improved CHI. Classification experiments show that the ICHI method extracts features better than the traditional CHI method when both are used to select features for SVM and KNN classification, and that the ICHI method can enhance the accuracy of text classification and is well suited to feature extraction.
The Analysis and Optimization of KNN Algorithm Space-Time Efficiency for Chinese Text Categorization
The performance of any text classification algorithm is reflected in the reliability of its classification results and in the efficiency of the algorithm. We analyze the space-time efficiency of the different stages of the traditional KNN process for Chinese text classification while ensuring the reliability of classification. And we optimize efficiency of the algorithm and the...
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...